Facial Expression Recognition Based Music Recommendation System Using Transfer Learning with Xception

Authors: Mohit Singh Mahara, Manan Verma, Niskam Chaudhary, Mohd Zaid Ali Khan, Rudresh Kaushik

DOI Link: https://doi.org/10.22214/ijraset.2026.81815

Abstract

Emotion-aware recommender systems are gaining importance in human-computer interaction. Facial expression recognition (FER) provides an effective way to detect human emotion using computer vision techniques. This research presents an intelligent music recommendation system that identifies human emotion and recommends music using the Spotify API. The model uses a transfer learning approach with a pre-trained Xception convolutional neural network. The system is implemented in Python with the NumPy, Pandas, and OpenCV libraries and the Spotify API. The model is trained on the FER2013 Balanced dataset available on Kaggle and classifies emotions into seven categories: happy, sad, neutral, fear, surprise, angry, and disgust. The Xception-based approach achieves higher accuracy than traditional machine learning algorithms: Support Vector Machine (SVM), k-Nearest Neighbors (KNN), Random Forest (RF), and Naïve Bayes (NB). The model achieves 92.4% accuracy, which demonstrates the effectiveness of transfer learning for music recommendation systems.

Introduction

This paper describes a Facial Expression Recognition (FER) based Music Recommendation System that uses deep learning to identify human emotions from facial images and recommend suitable music.

Emotion recognition is important for communication and human-computer interaction. The system classifies facial expressions into seven categories: happy, sad, angry, fear, surprise, neutral, and disgust. Traditional machine learning methods like SVM, KNN, and Random Forest rely on manual feature extraction and perform poorly on complex real-world facial data. In contrast, deep learning—especially CNN-based models—automatically learns features and performs better.
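As a minimal sketch of the final classification step, the snippet below turns seven raw class scores into a single emotion label with a softmax followed by an argmax. The index order of `EMOTIONS` and the helper names are assumptions for illustration; the source does not state how the authors order or decode the classes.

```python
import math

# The seven categories named in the text; this index order is an assumption.
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_emotion(logits):
    """Return the most probable emotion label and its probability."""
    if len(logits) != len(EMOTIONS):
        raise ValueError("expected one score per emotion category")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best], probs[best]
```

Calling `decode_emotion` on a score vector whose largest entry sits at index 3 would return `"happy"` under the assumed ordering.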

The system uses the FER2013 dataset, which contains around 35,000 grayscale facial images. Because the dataset varies in lighting and pose and suffers from class imbalance, preprocessing techniques such as normalization, resizing, augmentation, and face detection are applied.
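The preprocessing steps named above could look roughly like the following NumPy sketch. It assumes a 2-D `uint8` grayscale face crop as input; the target size of 96, the [0, 1] scaling, the horizontal-flip augmentation, and the inverse-frequency class weights are all illustrative choices, not settings reported in the source.

```python
import numpy as np


def preprocess_face(gray, size=96):
    """Resize a 2-D uint8 face crop, scale to [0, 1], and stack to 3 channels."""
    h, w = gray.shape
    # Nearest-neighbour resize: pick the source row/column for each target pixel.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = gray[rows[:, None], cols[None, :]]
    scaled = resized.astype(np.float32) / 255.0
    # Pretrained RGB backbones expect 3 channels, so repeat the gray channel.
    return np.repeat(scaled[:, :, None], 3, axis=2)


def augment_flip(image):
    """Horizontal flip, a common label-preserving augmentation for faces."""
    return image[:, ::-1, :]


def class_weights(labels, num_classes=7):
    """Inverse-frequency weights, one possible way to counter class imbalance."""
    counts = np.bincount(labels, minlength=num_classes).astype(np.float64)
    total = counts.sum()
    counts[counts == 0] = 1.0  # avoid division by zero for absent classes
    return total / (num_classes * counts)
```

In practice a library resize with interpolation would usually replace the nearest-neighbour indexing; it is used here only to keep the sketch dependency-free beyond NumPy.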

The proposed model uses transfer learning with the Xception architecture, a deep CNN that uses depthwise separable convolutions for efficient and accurate feature extraction. The system captures images in real time using a webcam, detects the face using OpenCV, preprocesses the image, and classifies emotion using the Xception model.
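To illustrate why depthwise separable convolutions are described as efficient, the sketch below compares weight counts for a standard convolution and a depthwise separable one. The layer sizes in the usage note are hypothetical and biases are ignored; the figures are not taken from the authors' model.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution (biases ignored)."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_factor(kernel, c_in, c_out):
    """How many times fewer weights the separable form needs."""
    return standard_conv_params(kernel, c_in, c_out) / separable_conv_params(kernel, c_in, c_out)
```

For a hypothetical 3 x 3 layer with 128 input and 256 output channels, the standard form needs 294,912 weights and the separable form 33,920, a reduction of roughly 8.7 times.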

After emotion detection, the system maps the recognized emotion to suitable music categories. It then uses the Spotify API to recommend songs that match the user’s mood (e.g., happy → energetic music, sad → relaxing music). This creates a personalized, emotion-aware music recommendation experience.
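The emotion-to-music mapping can be sketched as a lookup table feeding a Spotify Web API search URL. Only the happy and sad entries come from the text; every other keyword, the fallback, and the helper name `build_search_url` are hypothetical. The sketch only builds the URL; an actual request would also need an OAuth access token in the `Authorization` header.

```python
from urllib.parse import urlencode

# Only "happy" and "sad" are taken from the text; the other entries are
# hypothetical placeholders, not the authors' mapping.
EMOTION_TO_QUERY = {
    "happy": "energetic",
    "sad": "relaxing",
    "angry": "calm",
    "fear": "soothing",
    "surprise": "upbeat",
    "neutral": "chill",
    "disgust": "feel good",
}

DEFAULT_QUERY = "chill"  # assumed fallback for unknown labels
SEARCH_ENDPOINT = "https://api.spotify.com/v1/search"


def build_search_url(emotion, limit=10):
    """Build a track-search URL for the music keyword mapped to an emotion."""
    query = EMOTION_TO_QUERY.get(emotion, DEFAULT_QUERY)
    params = {"q": query, "type": "track", "limit": limit}
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

Keeping the mapping in a plain dictionary makes it easy to adjust categories without touching the recognition model.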

The literature review shows that while traditional ML methods and basic CNNs have been widely used, they struggle with accuracy and generalization. Advanced models like Xception improve performance significantly but are underexplored in music recommendation systems.

Conclusion

This research presents a facial expression recognition-based music recommendation system using the Xception transfer learning model. The system successfully detects the user's emotion from facial images and recommends music using the Spotify API. Experimental results demonstrate that the Xception-based model outperforms the traditional machine learning algorithms, achieving an accuracy of 92.4% on the FER2013 dataset.


Copyright

Copyright © 2026 Mohit Singh Mahara, Manan Verma, Niskam Chaudhary, Mohd Zaid Ali Khan, Rudresh Kaushik. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET81815

Publish Date : 2026-05-02

ISSN : 2321-9653

Publisher Name : IJRASET
